Using Inverted Indices for Accelerating LINGO Calculations

Authors

  • Thomas Greve Kristensen
  • Jesper Nielsen
  • Christian N. S. Pedersen
Abstract

The ever-growing size of chemical databases calls for the development of novel methods for representing and comparing molecules. One such method, LINGO, is based on fragmenting the SMILES string representation of molecules. Molecules can then be compared by calculating the Tanimoto coefficient, which is called LINGOsim when applied to LINGO multisets. This paper introduces a verbose representation for storing LINGO multisets, which makes it possible to transform them into sparse fingerprints so that fingerprint data structures and algorithms can be used to accelerate queries. The previous best method for rapidly calculating the LINGOsim similarity matrix required specialized hardware to yield a significant speedup over existing methods. By representing LINGO multisets in the verbose representation and using inverted indices, it is possible to calculate LINGOsim similarity matrices roughly 2.6 times faster than existing methods without relying on specialized hardware.
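The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fragment length `q=4`, the absence of any SMILES preprocessing, the encoding of the verbose representation as `(fragment, occurrence number)` pairs, and all function names are assumptions made for illustration. The sketch shows that tagging each occurrence turns a multiset into a plain set with the same Tanimoto coefficient, so a standard inverted index over set features can count shared features for a query.

```python
from collections import Counter, defaultdict

def lingos(smiles, q=4):
    """Multiset of all length-q substrings of a SMILES string (q=4 is assumed)."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))

def verbose(multiset):
    """Assumed encoding: tag each occurrence with its ordinal so the multiset
    becomes an ordinary set, i.e. a sparse fingerprint."""
    return {(frag, k) for frag, n in multiset.items() for k in range(n)}

def tanimoto_multiset(a, b):
    """Tanimoto coefficient on multisets: sum of min counts over sum of max counts."""
    inter = sum((a & b).values())
    union = sum((a | b).values())
    return inter / union if union else 0.0

def build_inverted_index(fingerprints):
    """Map each feature to the ids of the molecules whose fingerprint contains it."""
    index = defaultdict(list)
    for mol_id, fp in enumerate(fingerprints):
        for feature in fp:
            index[feature].append(mol_id)
    return index

def query(index, sizes, fp):
    """Similarity of fp to every indexed molecule sharing at least one feature.
    Molecules that are not returned have similarity 0."""
    shared = Counter()
    for feature in fp:
        for mol_id in index.get(feature, ()):
            shared[mol_id] += 1
    return {m: c / (len(fp) + sizes[m] - c) for m, c in shared.items()}

# Hypothetical toy database of SMILES strings.
database = ["CCCCCC", "CCCCO", "NNNNN"]
fingerprints = [verbose(lingos(s)) for s in database]
index = build_inverted_index(fingerprints)
sizes = [len(fp) for fp in fingerprints]
result = query(index, sizes, verbose(lingos("CCCCC")))
```

Only molecules that share at least one feature with the query are ever touched, which is where an inverted index saves work compared with scoring every pair directly.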

Similar resources

Two new three and four parametric with memory methods for solving nonlinear equations

In this study, based on the optimal free derivative without memory methods proposed by Cordero et al. [A. Cordero, J.L. Hueso, E. Martinez, J.R. Torregrosa, Generating optimal derivative free iterative methods for nonlinear equations by using polynomial interpolation, Mathematical and Computer Modeling. 57 (2013) 1950-1956], we develop two new iterative with memory methods for solving a nonline...

Mathematical Model of Financial Investment Risk

This paper establishes the income and risk model in financial investment based on multi-objective programming theory, aiming to analyze the relationship between risk and return in financial investment and discuss the relationship between the risk the investor shall bear and decentralization degree of investment project. MATLAB software is used to analyze the investor’s optimized return under fi...

Accelerating Spatial Join Operations using Bit-Indices

Spatial join is a very expensive operation in spatial databases. In this paper, we propose an innovative method for accelerating spatial join operations using Spatial Join Bitmap (SJB) indices. The SJB indices are used to keep track of intersecting entities in the joining data sets. We provide algorithms for constructing SJB indices and for maintaining the SJB indices when the data sets are upd...

Blocking LINGO-1 function promotes retinal ganglion cell survival following ocular hypertension and optic nerve transection.

PURPOSE LINGO-1 is a functional member of the Nogo66 receptor (NgR1)/p75 and NgR1/TROY signaling complexes that prevent axonal regeneration through RhoA in the central nervous system. LINGO-1 also promotes cell death after neuronal injury and spinal cord injury. The authors sought to examine whether blocking LINGO-1 function with LINGO-1 antagonists promotes retinal ganglion cell (RGC) survival...

Inverted Pendulum Control Using Negative Data

In the training phase of learning algorithms, it is always important to have a suitable training data set. The presence of outliers, noise data, and inappropriate data always affects the performance of existing algorithms. The active learning method (ALM) is one of the powerful tools in soft computing inspired by the computation of the human brain. The operation of this algorithm is complete...

Journal:
  • Journal of Chemical Information and Modeling

Volume 51, Issue 3

Pages: -

Publication date: 2011